Finding the Best-Fit Bounding-Boxes

Authors

  • Bo Yuan
  • Leong Kwoh
  • Chew Lim Tan
Abstract

The bounding-box of a geometric shape in 2D is the rectangle with the smallest area in a given orientation (usually upright) that completely contains the shape. The best-fit bounding-box is the smallest bounding-box among all possible orientations of the same shape. In the context of document image analysis, the shapes can be characters (individual components) or paragraphs (component groups). This paper presents a search algorithm for the best-fit bounding-boxes of textual component groups, whose shapes are customarily rectangular in almost all languages. One application of best-fit bounding-boxes is skew estimation from the text blocks in document images. This approach is capable of multi-skew estimation and location, and it can also process documents with sparse text regions. The University of Washington English Document Image Database (UW-I) is used to verify the skew estimation method directly and the proposed best-fit bounding-boxes algorithm indirectly.
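The definition above can be illustrated with a minimal sketch. This is not the search algorithm proposed in the paper, whose details are not given in the abstract; it is a plain exhaustive scan over orientations, assuming the shape is supplied as a list of 2D points. The function names `bounding_box_area` and `best_fit_bounding_box` and the `step_deg` parameter are hypothetical and chosen only for illustration.

```python
import math


def bounding_box_area(points, angle_deg):
    """Area of the bounding box whose sides are aligned with axes
    rotated by angle_deg (counter-clockwise) from the upright axes."""
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    # Project every point onto the two rotated axes.
    us = [x * c + y * s for x, y in points]
    vs = [-x * s + y * c for x, y in points]
    return (max(us) - min(us)) * (max(vs) - min(vs))


def best_fit_bounding_box(points, step_deg=0.1):
    """Brute-force scan (an assumption, not the paper's method).

    Orientations in [0, 90) suffice because rotating a rectangle by
    90 degrees only swaps its width and height.
    Returns (angle_deg, area) of the smallest box found.
    """
    best_angle, best_area = 0.0, float("inf")
    steps = int(round(90.0 / step_deg))
    for i in range(steps):
        angle = i * step_deg
        area = bounding_box_area(points, angle)
        if area < best_area:
            best_angle, best_area = angle, area
    return best_angle, best_area


if __name__ == "__main__":
    # A 4 x 2 rectangle rotated by 10 degrees, standing in for a skewed text block.
    t = math.radians(10.0)
    corners = [
        (x * math.cos(t) - y * math.sin(t), x * math.sin(t) + y * math.cos(t))
        for x, y in [(-2, -1), (2, -1), (2, 1), (-2, 1)]
    ]
    print(best_fit_bounding_box(corners))
```

For the rotated rectangle in the usage example, the scan should report an orientation close to 10 degrees and an area close to 8, which is how the best-fit orientation doubles as a skew estimate for a rectangular text block. The scan's accuracy is limited by `step_deg`, and its cost grows with the number of orientations tested.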

Similar resources

Finding Optimized Bounding Boxes of Polytopes in D-dimensional Space and Their Properties in K-dimensional Projections

Dynamic Collision Detection using Oriented Bounding Boxes

2 Oriented Bounding Boxes; 2.1 Separation of OBBs; 2.2 Testing for Intersection of OBBs; 2.2.1 Stationary; 2.2.2 Constant Velocities ...

Bounds on the quality of the PCA bounding boxes

Principal component analysis (PCA) is commonly used to compute a bounding box of a point set in R^d. The popularity of this heuristic lies in its speed, its easy implementation, and the fact that PCA bounding boxes usually approximate the minimum-volume bounding boxes quite well. We present examples of discrete point sets in the plane, showing that the worst-case ratio of the volume of the PCA bou...

Optimal Packing of High-Precision Rectangles

The rectangle-packing problem consists of finding an enclosing rectangle of smallest area that can contain a given set of rectangles without overlap. Our new benchmark includes rectangles of successively higher precision, challenging the previous state-of-the-art, which enumerates all locations for placing rectangles, as well as all bounding box widths and heights up to the optimal box. We inst...

Extraction of text lines and text blocks on document images based on statistical modeling

In this article, we developed a Bayesian model to characterize text line and text block structures on document images using the text word bounding boxes. We posed the extraction problem as finding the text lines and text blocks that maximize the Bayesian probability of the text lines and text blocks given the text word bounding boxes. In particular, we derived the so-called probabilistic linear...

Publication date: 2006